Xuefeng Bai

Beyond Unimodal Shortcuts: MLLMs as Cross-Modal Reasoners for Grounded Named Entity Recognition

Feb 04, 2026

Decoupling Skeleton and Flesh: Efficient Multimodal Table Reasoning with Disentangled Alignment and Structure-aware Guidance

Feb 03, 2026

Instruction Anchors: Dissecting the Causal Dynamics of Modality Arbitration

Feb 03, 2026

Beyond Rigid: Benchmarking Non-Rigid Video Editing

Jan 26, 2026

Character-R1: Enhancing Role-Aware Reasoning in Role-Playing Agents via RLVR

Jan 08, 2026

From Bias to Balance: Exploring and Mitigating Spatial Bias in LVLMs

Sep 26, 2025

Evaluating and Steering Modality Preferences in Multimodal Large Language Model

May 27, 2025

Handling Imbalanced Pseudolabels for Vision-Language Models with Concept Alignment and Confusion-Aware Calibrated Margin

May 04, 2025

MoK-RAG: Mixture of Knowledge Paths Enhanced Retrieval-Augmented Generation for Embodied AI Environments

Mar 18, 2025

Adaptive Inner Speech-Text Alignment for LLM-based Speech Translation

Mar 13, 2025